In Pure Inductive Logic, the rational principle of Predicate Exchangeability
states that permuting the predicates in a given language L and replacing each
occurrence of a predicate in an L-sentence $\varphi$ according to this permutation
should not change our belief in the truth of $\varphi$. In this paper we study when
a probability function w on a purely unary language L satisfying Predicate
Exchangeability also satisfies the principle of Unary Language Invariance.
1 Introduction



In the study of logical probability in the sense of Carnap's Inductive Logic programme, 
PQ, [2], the notion of symmetry plays a leading role. In the assignment of beliefs, as 
subjective probabilities, it seems logical, or rational, to observe prevailing symmetries, a 
typical example being the perceived fairness of a coin toss, at least in the absence of any
inside knowledge to the contrary. For this reason a number of rational principles have
been proposed in Inductive Logic which are based on invariance under various notions
of symmetry, principles which it is argued a choice of logical or rational (we use these
two words synonymously) probability function should satisfy. The most prevailing of
these, accepted by both the founding fathers of Inductive Logic, W.E. Johnson [10],
and Rudolf Carnap [3], is that the names we give things, in particular constants and
predicates, should not matter when it comes to assigning probabilities. Thus, since
interchanging which side of the coin we call heads and which we call tails does not
change what we understand by a coin toss, both outcomes should rationally receive the
same probability.

*Submitted to the Proceedings of the 1st Reasoning Club Meeting, eds. J. P. van Bendegem, J. Murzi,
University Foundation, Brussels, 2012, appearing in Logique et Analyse.

A second, ubiquitous, rational principle is that when assigning rational probabilities 
'irrelevant information' can be disregarded. Indeed the central principle of Johnson 
and Carnap, the so called Johnson's Sufficientness Postulate, is just such an example. 
Just as with saying what exactly we might mean by a 'symmetry', this directive does
of course raise the question of what exactly we mean by 'irrelevant information',
and numerous interpretations have been mooted, generally based on the idea that such 
information is expressed in a disjoint, or partially disjoint language. 

A third, more recent and rather overarching, rational principle is the requirement of 
language invariance. By that we mean that to be rational a probability function should 
not be restricted to one special language but be extendable to larger languages, and 
furthermore that those additional rational principles which we imposed in the context 
of the original language should also be satisfied by these extensions. 

In this paper we shall study two symmetry principles, Constant Exchangeability and 
Predicate Exchangeability, in the presence of language invariance with the main goal 
of providing a representation theorem along the lines of de Finetti's Representation 
Theorem for Constant Exchangeability alone, see for example [5], [11]. Although rather
technical, at least in relation to the seemingly elementary mathematics at the heart of
Inductive Logic, such results have, starting with Gaifman [6] and Humburg [9], been
an extremely powerful tool in our understanding of the interrelationship between the 
various rational principles which have been proposed. Hopefully the results given here 
will also find similar applications in the future. 

The structure of this paper is as follows. In Section 2 we shall introduce the notation and 
give precise formulations of the main principles we shall be studying. In Section 3 we shall 
provide a representation theorem for probability functions satisfying language invariance 
with Constant and Predicate Exchangeability assuming a particularly strong irrelevance 
condition, the Constant Irrelevance Principle, and in the next section show a similar 
result without this assumption. This latter representation theorem shows that all such 
probability functions are in a sense convex mixtures of probability functions satisfying 
the so called Weak Irrelevance Principle, and conversely. Finally in Section 5 we will
give a general representation theorem for probability functions satisfying Constant and
Predicate Exchangeability alone, showing that they are mixtures (not necessarily convex)
of such probability functions which additionally satisfy language invariance.

1 Johnson's Permutation Postulate and Carnap's Axiom of Symmetry.

The philosophical standpoint of this paper is Pure Inductive Logic, see [11], [12], a branch
of Carnap's Inductive Logic which he already described in [3]. Thus we shall be interested
in studying logical probability without relation to specific interpretations. Of course the 
rational principles one proposes may have their genesis in real world examples but once 
a principle is formulated it is studied in Pure Inductive Logic through the agency of 
mathematics. The subsequent interest within philosophy lies, we would opine, mainly 
in considering what these mathematical conclusions tell us about the original and like 
motivating examples. 



2 Notation and Principles 

We will be working in the usual context of (unary) Pure Inductive Logic. Thus the
first order languages we will be concerned with consist only of finitely many unary
predicate symbols $P_i$ and countably many constant symbols $a_1, a_2, \dots, a_m, \dots$, which
should be thought of as exhausting the universe. We will write $L_q$ to indicate the
language containing just the predicates $P_1, \dots, P_q$. Let SL denote the set of sentences
of the language L, QFSL the set of quantifier-free sentences of L.

An atom $\alpha(x)$ of L is a formula

\[ P_1^{\epsilon_1}(x) \wedge P_2^{\epsilon_2}(x) \wedge \dots \wedge P_q^{\epsilon_q}(x), \]

with $\epsilon_j \in \{0,1\}$ and $P_j^1(x)$, $P_j^0(x)$ standing for $P_j(x)$, $\neg P_j(x)$, respectively. Note that
for L containing q predicates there are $2^q$ atoms, which we shall denote $\alpha_1, \dots, \alpha_{2^q}$.

A state description of L for $a_{i_1}, \dots, a_{i_n}$ is a sentence

\[ \Theta(a_{i_1}, \dots, a_{i_n}) = \bigwedge_{j=1}^{n} \alpha_{h_j}(a_{i_j}), \]

where $h_j \in \{1, \dots, 2^q\}$ for $j = 1, \dots, n$.

A probability function on L is a function $w : SL \to [0,1]$ satisfying the following
conditions for all $\vartheta, \varphi, \exists x\, \psi(x) \in SL$:

(P1) If $\models \vartheta$, then $w(\vartheta) = 1$.

(P2) If $\vartheta \models \neg\varphi$, then $w(\vartheta \vee \varphi) = w(\vartheta) + w(\varphi)$.

(P3) $w(\exists x\, \psi(x)) = \lim_{n \to \infty} w\left(\bigvee_{i=1}^{n} \psi(a_i)\right)$.

2 For convenience, we shall henceforth refer to these just as 'predicates' and 'constants'.
3 In the literature, the notation $\pm P_i(x)$ is more common; however, in the scope of this paper, the
notation $P_i^{\epsilon_i}(x)$ is more convenient.
4 The entries in such lists will be taken to be distinct unless otherwise stated.



The following theorem will allow us to restrict our studies to quantifier-free sentences. 

Theorem 1 (Gaifman, [7]). Let $w : QFSL \to [0,1]$ be a function satisfying (P1), (P2)
for all $\vartheta, \varphi \in QFSL$. Then there exists a unique $w' : SL \to [0,1]$ satisfying (P1)-(P3)
extending w.

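Since any quantifier-free sentence of L is logically equivalent to a disjunction of state
descriptions, by (P2) and Theorem 1 a probability function is determined by its values
on the state descriptions. Let

\[ \mathbb{D}_{2^q} = \Big\{ \langle x_1, \dots, x_{2^q} \rangle \;\Big|\; x_i \ge 0, \ \sum_{i=1}^{2^q} x_i = 1 \Big\}. \]

Then we obtain an example of a probability function by defining $w_{\vec{c}}$, for $\vec{c} \in \mathbb{D}_{2^q}$, on state descriptions by

\[ w_{\vec{c}}\Big(\bigwedge_{j=1}^{n} \alpha_{h_j}(a_{i_j})\Big) = \prod_{j=1}^{n} c_{h_j} = \prod_{i=1}^{2^q} c_i^{n_i}, \]

where $n_i = |\{ j \mid h_j = i \}|$.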

These functions are quite important examples, as they form the building blocks in de 
Finetti's Representation Theorem. Before stating this theorem, we need to introduce 
the Principle of Constant Exchangeability: 

The Principle of Constant Exchangeability, Ex

A probability function w on SL satisfies Constant Exchangeability if for each
$\varphi(a_1, \dots, a_n) \in SL$, and $\sigma$ a permutation of $\mathbb{N}^+$ (= {1, 2, 3, ...}),

\[ w(\varphi(a_1, \dots, a_n)) = w(\varphi(a_{\sigma(1)}, \dots, a_{\sigma(n)})). \]

Notice that the $w_{\vec{c}}$ satisfy Ex. Ex is such a well accepted principle in Inductive Logic that
we shall henceforth take it as a standing assumption throughout that all the probability
functions we consider satisfy it.

We shall therefore not mention the particular constants whenever they are understood
from the context.

Theorem 2 (de Finetti's Representation Theorem). Let $L = L_q$ and w be a probability
function on SL satisfying Ex. Then there exists a normalized, $\sigma$-additive measure
$\mu$ on the Borel subsets of $\mathbb{D}_{2^q}$ such that

\[ w(\Theta(a_{i_1}, \dots, a_{i_n})) = \int_{\mathbb{D}_{2^q}} w_{\vec{x}}(\Theta(a_{i_1}, \dots, a_{i_n})) \, d\mu(\vec{x}). \]

Conversely, given such a measure $\mu$, the function w defined via this equation is a probability
function on SL satisfying Ex.

It is straightforward to show (see [12]) that these $w_{\vec{c}}$ are characterized as those
probability functions which satisfy Ex together with

The Principle of Constant Irrelevance, IP

A probability function w on SL satisfies Constant Irrelevance if for $\vartheta, \varphi \in QFSL$
with no constants in common,

\[ w(\vartheta \wedge \varphi) = w(\vartheta) \cdot w(\varphi). \]



Thus de Finetti's Representation Theorem can be alternately stated as saying that every 
probability function satisfying Ex is a convex mixture of probability functions satisfying 
IP, and conversely. 
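The contrast between the building blocks and their mixtures can be checked on a toy case. The following is a minimal sketch, assuming the product definition of $w_{\vec{c}}$ on state descriptions and a language with a single predicate (atoms indexed 0 for $P$ and 1 for $\neg P$); the names `w_c`, `mixture`, `C1` and `C2` are illustrative only and do not come from the paper.

```python
from fractions import Fraction

def w_c(c, hs):
    """Value of w_c on the state description whose j-th constant satisfies atom hs[j]."""
    out = Fraction(1)
    for h in hs:
        out *= c[h]
    return out

# Two hypothetical points of D_2 (one predicate, two atoms).
C1 = (Fraction(9, 10), Fraction(1, 10))
C2 = (Fraction(1, 10), Fraction(9, 10))

def mixture(hs):
    """Equal-weight convex mixture of w_C1 and w_C2."""
    return (w_c(C1, hs) + w_c(C2, hs)) / 2

# Ex holds: reordering the constants does not change the value.
print(mixture([0, 1]), mixture([1, 0]))
# IP fails for the mixture: the two sentences share no constant, yet
# the probability of the conjunction is not the product of the parts.
print(mixture([0, 0]), mixture([0]) * mixture([0]))
```

Each $w_{\vec{c}}$ on its own factorizes over constants, so it is only the mixing that breaks Constant Irrelevance here.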

The principles that are of particular interest to us in this paper are:

The Principle of Predicate Exchangeability, Px

A probability function w on SL satisfies Predicate Exchangeability if whenever $\varphi \in SL$
and $\varphi'$ is the result of replacing the predicates $P_{j_1}, \dots, P_{j_m}$ in $\varphi$ by $P_{k_1}, \dots, P_{k_m}$, then

\[ w(\varphi) = w(\varphi'). \]

The Principle of Unary Language Invariance, ULi

A probability function w on SL satisfies Unary Language Invariance if there exists a
family of probability functions $w^{\mathcal{L}}$, one for each finite (unary) language $\mathcal{L}$, satisfying
Px (and by standing assumption Ex), such that $w = w^{L}$ and whenever $\mathcal{L} \subseteq \mathcal{L}'$, then
$w^{\mathcal{L}} = w^{\mathcal{L}'} \restriction S\mathcal{L}$, the restriction of $w^{\mathcal{L}'}$ to the sentences of $\mathcal{L}$.

We say that w satisfies ULi with $\mathcal{P}$ (for some principle $\mathcal{P}$), if each of the functions $w^{\mathcal{L}}$
satisfies $\mathcal{P}$.

Notice that if $w^{\mathcal{L}}, w^{\mathcal{L}'}$ are members of a language invariant family and $\mathcal{L}, \mathcal{L}'$ have the
same number of predicates then $w^{\mathcal{L}}$ is the same as $w^{\mathcal{L}'}$ up to renaming predicates. For
that reason it will, for the most part, be enough for us to focus our attention on the
members $w^{\mathcal{L}}$ of the family when $\mathcal{L} = L_q$ for some q.

5 In such lists we shall always assume that the members are distinct.

This also illustrates the motivation for pairing ULi with Px; for if we were to drop Px
from the definition, then $w^{\mathcal{L}}$ would depend on the particular set of predicates in $\mathcal{L}$, and
we would be imposing some a priori semantics on the languages.

Given a permutation $\sigma$ of the predicates of L, there is a unique permutation of the atoms
of L that is induced by $\sigma$: For $\alpha(x) = \bigwedge_{i=1}^{q} P_i^{\epsilon_i}(x)$ an atom of L, let $\sigma\alpha(x)$ be the atom
given by

\[ \sigma\alpha(x) = \bigwedge_{i=1}^{q} \sigma(P_i)^{\epsilon_i}(x). \]

This now in turn induces a permutation on SL in the obvious way. Abusing notation,
we identify these permutations of atoms and L-sentences with $\sigma$. We shall write $\sigma$ is
induced by Px to indicate that $\sigma$ arises from a permutation of predicates.
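The induced action on atoms can be made concrete with a small sketch. It assumes atoms are encoded as tuples of signs $(\epsilon_1, \dots, \epsilon_q)$ with 1 for a positive and 0 for a negated predicate, and a predicate permutation as a dictionary on 0-based indices; these encodings and the function names are illustrative choices, not notation from the paper.

```python
from itertools import product

def atoms(q):
    """All 2^q atoms of L_q as sign tuples; entry i is 1 for P_{i+1} and 0 for its negation."""
    return list(product((1, 0), repeat=q))

def permute_atom(sigma, atom):
    """Atom induced by sigma: the sign carried by P_i is moved to sigma(P_i)."""
    image = [None] * len(atom)
    for i, eps in enumerate(atom):
        image[sigma[i]] = eps
    return tuple(image)

q = 3
sigma = {0: 1, 1: 2, 2: 0}  # hypothetical cycle P_1 -> P_2 -> P_3 -> P_1
induced = {a: permute_atom(sigma, a) for a in atoms(q)}

for a, b in induced.items():
    print(a, "->", b)
```

The induced map is a bijection on atoms and never changes how many predicates occur negated, which is the fact used at the start of the next section.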



3 A First Representation Theorem 

Since the $w_{\vec{c}}$ are the building blocks for probability functions satisfying Ex (see de
Finetti's Theorem above), these functions are of special interest to us. We will therefore
begin by studying when they satisfy ULi, equivalently when probability functions
satisfying Ex and IP satisfy ULi.

Suppose a probability function w on some language L satisfies Predicate Exchangeability.
Then the probability that w assigns any atom $\alpha$ of L only depends on the number
of predicates in $\alpha$ that occur negated. To see this notice that if $\alpha, \alpha'$ are atoms then
$\alpha'$ can be obtained from $\alpha$ by a permutation of predicates just if both atoms have the
same number of negated predicates.

It is thus convenient to introduce a function assigning each atom the corresponding 
number of predicates: 

Definition 3:

Let $L = L_q$. Define $\gamma_q : \{1, \dots, 2^q\} \to \{0, \dots, q\}$ by

\[ \gamma_q(i) = k \iff \alpha_i \text{ contains } k \text{ negated predicates}. \]

We shall drop the index q whenever it is understood from the context.

6 In fact, as one easily checks, without Px, all of the $w_{\vec{c}}$ functions can be extended to obtain a language
invariant family, and the choices are arbitrary on every level, which makes Language Invariance in
this form a trivial statement.

7 This is an arbitrary choice. One could also count the number of predicates that occur positively in
$\alpha$, as the argument is symmetrical.
Now considering $\vec{c} \in \mathbb{D}_{2^q}$ it follows that $w_{\vec{c}}$ satisfies Predicate Exchangeability if and only
if $c_i = c_j$ whenever $\gamma(i) = \gamma(j)$. With this in mind we shall assume that our enumeration
of the atoms is such that the number of negated predicates is non-decreasing as we move
right through $\alpha_1, \alpha_2, \dots, \alpha_{2^q}$. Since for each $i \in \{0, \dots, q\}$ there are $\binom{q}{i}$ atoms of $L_q$
with i predicates occurring negatively we therefore have that for $w_{\vec{c}}$ satisfying Px

\[ \vec{c} = (C_0, C_1, \dots, C_1, C_2, \dots, C_2, \dots, C_{q-1}, \dots, C_{q-1}, C_q), \]

i.e. $c_i = C_{\gamma(i)}$ for $i = 1, 2, \dots, 2^q$, and

\[ 1 = \sum_{i=1}^{2^q} c_i = \sum_{j=0}^{q} \binom{q}{j} C_j. \]

Thus any such $\vec{c}$ gives us a unique $\mathcal{C} = (C_0, C_1, C_2, \dots, C_q)$ with the properties

\[ \forall i \in \{0, \dots, q\}\ C_i \ge 0 \quad \text{and} \quad 1 = \sum_{i=0}^{q} \binom{q}{i} C_i. \]

Conversely, any $\mathcal{C}$ with these properties provides a unique $\vec{c} \in \mathbb{D}_{2^q}$ such that $w_{\vec{c}}$ satisfies
Px, giving us a 1-1 correspondence between these $\vec{c} \in \mathbb{D}_{2^q}$ and the elements of

\[ \mathbb{D}_q := \Big\{ \mathcal{C} = (C_0, C_1, C_2, \dots, C_q) \;\Big|\; \forall i \in \{0, \dots, q\}\ C_i \ge 0 \text{ and } 1 = \sum_{i=0}^{q} \binom{q}{i} C_i \Big\}. \qquad (2) \]

We shall refer to elements of the set above as the alternative notation for such a $\vec{c} \in \mathbb{D}_{2^q}$.
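The correspondence between a full vector over the $2^q$ atoms and its alternative notation can be sketched as follows. This is only an illustration under the sign-tuple encoding of atoms used for convenience here (1 positive, 0 negated), with atoms ordered by non-decreasing number of negations; `expand`, `compress` and the sample values are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def atoms_sorted(q):
    """Atoms of L_q as sign tuples, ordered by non-decreasing number of negations."""
    return sorted(product((1, 0), repeat=q), key=lambda a: a.count(0))

def expand(C):
    """Alternative notation (C_0, ..., C_q) -> full vector c over the 2^q atoms."""
    q = len(C) - 1
    return [C[a.count(0)] for a in atoms_sorted(q)]

def compress(c, q):
    """Full vector c -> (C_0, ..., C_q); fails if c is not constant on atoms
    with the same number of negated predicates (the condition for Px)."""
    C = [None] * (q + 1)
    for a, ci in zip(atoms_sorted(q), c):
        k = a.count(0)
        if C[k] is None:
            C[k] = ci
        elif C[k] != ci:
            raise ValueError("c is not constant on atoms with equal gamma")
    return C

C = [Fraction(1, 2), Fraction(1, 8), Fraction(1, 4)]  # sample point for q = 2
c = expand(C)
print(c, sum(c), sum(comb(2, i) * C[i] for i in range(3)))
```

The two sums printed at the end agree because each $C_i$ is repeated $\binom{q}{i}$ times in the full vector.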

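Given an atom $\alpha$ of $L_q$, we can view this atom as a quantifier-free sentence in the
extended language $L_{q+1}$, and obtain

\[ \alpha(x) \equiv \alpha^{+}(x) \vee \alpha^{-}(x) = (\alpha(x) \wedge P_{q+1}(x)) \vee (\alpha(x) \wedge \neg P_{q+1}(x)). \]

Now suppose $\vec{c} \in \mathbb{D}_{2^q}$, $\vec{d} \in \mathbb{D}_{2^{q+1}}$ are such that $w_{\vec{d}} \restriction SL_q = w_{\vec{c}}$ and both satisfy Px. Then
by the logical equivalence given above, we must have

\[ w_{\vec{c}}(\alpha) = w_{\vec{d}}(\alpha) = w_{\vec{d}}(\alpha^{+}) + w_{\vec{d}}(\alpha^{-}). \]

Suppose $\mathcal{C} \in \mathbb{D}_q$, $\mathcal{D} \in \mathbb{D}_{q+1}$ are the corresponding alternative notations for $\vec{c}$ and $\vec{d}$. Then
we obtain for each $i \in \{0, \dots, q\}$,

\[ C_i = D_i + D_{i+1}. \]

The following proposition generalizes this to ULi families.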

Proposition 4. Let $w_{\vec{c}}$ be a probability function on $L_q$. Suppose $w_{\vec{c}}$ is a member of a
ULi with IP family W and assume $w_{\vec{d}} \in W$ is a probability function on $L_r$ for some
$r > q$. Let $\mathcal{C}, \mathcal{D}$ be the corresponding alternative notations for $\vec{c}, \vec{d}$. Then for each
$j \in \{0, \dots, q\}$, we have

\[ C_j = \sum_{i=j}^{r-q+j} \binom{r-q}{i-j} D_i. \qquad (3) \]

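Proof: We show this by induction on $s := r - q$. In case $s = 1$, we have for each
$j \in \{0, \dots, q\}$,

\[ C_j = D_j + D_{j+1}, \]

since for $\alpha$ an atom of $L_q$ with j negated predicates, we have in $L_r$ (= $L_{q+1}$)

\[ \alpha \equiv \alpha^{+} \vee \alpha^{-}, \]

where $\alpha^{+}, \alpha^{-}$ are atoms of $L_r$ with $j, j+1$ negated predicates, respectively.

Now let $s = p + 1$ and assume the result holds for p. Let $D'_i$ denote the corresponding
values for the atoms of $L_{q+p}$. By the inductive hypothesis we have

\[ C_j = \sum_{i=j}^{p+j} \binom{p}{i-j} D'_i. \]

Just as in the case $s = 1$ we have $D'_k = D_k + D_{k+1}$ for each $0 \le k \le q + p$, so we obtain

\[ C_j = \sum_{i=j}^{p+j} \binom{p}{i-j} (D_i + D_{i+1}) = \sum_{i=j}^{p+1+j} \binom{p+1}{i-j} D_i = \sum_{i=j}^{r-q+j} \binom{r-q}{i-j} D_i, \]

where the middle step collects the coefficient of each $D_i$ and uses $\binom{p}{i-j} + \binom{p}{i-j-1} = \binom{p+1}{i-j}$,
as required. ⊣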
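The statement of Proposition 4 can be checked numerically by comparing repeated one-step marginalization with the closed binomial form. The sketch below assumes a sample vector $\mathcal{D}$ on $L_5$ of product form $D_i = x^i (1-x)^{5-i}$, chosen purely for illustration; the function names are not from the paper.

```python
from fractions import Fraction
from math import comb

def marginalize_once(D):
    """One step down: C_i = D_i + D_{i+1}."""
    return [D[i] + D[i + 1] for i in range(len(D) - 1)]

def marginalize(D, q):
    """Apply the one-step rule until only q predicates remain."""
    while len(D) - 1 > q:
        D = marginalize_once(D)
    return D

def closed_form(D, q):
    """C_j = sum_{i=j}^{r-q+j} binom(r-q, i-j) * D_i, with r = len(D) - 1."""
    r = len(D) - 1
    return [sum(comb(r - q, i - j) * D[i] for i in range(j, r - q + j + 1))
            for j in range(q + 1)]

x, r = Fraction(1, 3), 5  # hypothetical sample values
D = [x**i * (1 - x)**(r - i) for i in range(r + 1)]

print(marginalize(D, 2))
print(closed_form(D, 2))
```

Exact rational arithmetic is used so that the two computations can be compared for equality rather than up to rounding.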

With this proposition in mind, we are ready to proceed to the first Representation 
Theorem. 

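Theorem 5. Let $\vec{c} \in \mathbb{D}_{2^q}$ and $w_{\vec{c}}$ be a probability function satisfying Px. Then $w_{\vec{c}}$ is a
member of a ULi with IP family $W = \{ w_{\vec{d}_r} \mid \vec{d}_r \in \mathbb{D}_{2^r} \}$ if and only if each entry $c_i$ of $\vec{c}$
is of the form

\[ c_i = \int_{[0,1]} x^{\gamma(i)} (1-x)^{q-\gamma(i)} \, d\rho(x) \qquad (4) \]

for some normalized $\sigma$-additive measure $\rho$ on [0, 1].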
Proof: We will use methods from Nonstandard Analysis working in a suitable nonstandard
universe *V, see for example [4]. The key idea of the proof is to marginalize
some probability function on some infinite language to finite languages, rather than constructing
extensions of some probability function on a finite language to each finite level. Suppose we have such a ULi
with IP family W of probability functions, so for each $r \in \mathbb{N}$, we have some $w^{\langle r \rangle}$ on $L_r$
in this family. By the Transfer Principle this holds for each $r \in {}^*\mathbb{N}$, so we can pick some
nonstandard natural number $\nu \in {}^*\mathbb{N} \setminus \mathbb{N}$ and consider $w^{\langle \nu \rangle}$. Now $w^{\langle \nu \rangle} \restriction SL_r = w^{\langle r \rangle}$ for
each $r < \nu$, as these are members of the same ULi family and we can retrieve our original
family W by looking at functions of the form $w^{\langle \nu \rangle} \restriction SL_r$ for $r \in \mathbb{N}$, taking standard parts
(denoted as usual by $^\circ$) where necessary.

In more detail let *V be a nonstandard universe that contains at least $\mathbb{D}_{2^q}$ for finite
$q \in \mathbb{N}$, all probability functions $w_{\vec{c}}$ satisfying Px and everything else needed in this
proof. Let $\nu \in {}^*\mathbb{N}$ be nonstandard and consider $\vec{b} \in {}^*\mathbb{D}_{2^\nu}$ such that $w_{\vec{b}}$ on $L_\nu$ satisfies Px.
Assume that $\mathcal{B}$ is the alternative notation for $\vec{b}$ given by (2). For each $q < \nu$, we can
define a probability function $w_{\vec{c}}$ on $L_q$ in *V satisfying Px by letting

\[ C_j = \sum_{i=j}^{\nu-q+j} \binom{\nu-q}{i-j} B_i \]

for $j = 0, \dots, q$. In general, this gives $\vec{c} \in {}^*\mathbb{D}_{2^q}$, so we need to take the standard part of
$\vec{c}$, denoted $^\circ\vec{c}$, to get a probability function in V.

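We will first look at $\mathcal{B}$ when all weight is concentrated on a single $B_\kappa$, $0 \le \kappa \le \nu$. Since
we need to have $\sum_{k=0}^{\nu} \binom{\nu}{k} B_k = 1$, we obtain

\[ B_\kappa = \binom{\nu}{\kappa}^{-1}. \]

Then we get for $0 \le j \le q$

\[ C_j = \binom{\nu-q}{\kappa-j} \binom{\nu}{\kappa}^{-1}
     = \frac{(\nu-q)! \cdot \kappa! \cdot (\nu-\kappa)!}{(\kappa-j)! \cdot (\nu-q-\kappa+j)! \cdot \nu!}
     = \frac{\kappa(\kappa-1)\cdots(\kappa-j+1) \cdot (\nu-\kappa)\cdots(\nu-\kappa-q+j+1)}{\nu(\nu-1)\cdots(\nu-q+1)}, \]

thus leading to the standard part being

\[ {}^\circ C_j = \left({}^\circ\!\left(\frac{\kappa}{\nu}\right)\right)^{j} \left(1 - {}^\circ\!\left(\frac{\kappa}{\nu}\right)\right)^{q-j}. \qquad (6) \]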
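The passage to the standard part in (6) has a familiar finite counterpart: for large standard $\nu$ the ratio of binomial coefficients is close to $x^j (1-x)^{q-j}$ with $x = \kappa/\nu$. The sketch below illustrates this with ordinary integers standing in for the nonstandard $\nu$, which is an assumption made only for the illustration; `ratio` and `limit_value` are hypothetical names.

```python
from fractions import Fraction
from math import comb

def ratio(nu, q, kappa, j):
    """binom(nu-q, kappa-j) / binom(nu, kappa), computed via the product form in the text."""
    num = 1
    for t in range(j):
        num *= kappa - t            # kappa (kappa-1) ... (kappa-j+1)
    for t in range(q - j):
        num *= nu - kappa - t       # (nu-kappa) ... (nu-kappa-q+j+1)
    den = 1
    for t in range(q):
        den *= nu - t               # nu (nu-1) ... (nu-q+1)
    return Fraction(num, den)

def limit_value(x, q, j):
    """The candidate standard part x^j (1-x)^(q-j)."""
    return x**j * (1 - x)**(q - j)

q, j = 4, 1
for nu in (10**2, 10**4, 10**6):
    kappa = 3 * nu // 10            # keeps kappa/nu fixed at 3/10
    print(nu, float(ratio(nu, q, kappa, j)), limit_value(0.3, q, j))
```

The gap between the two printed columns shrinks roughly like $1/\nu$, which is why it becomes infinitesimal for nonstandard $\nu$.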



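Now consider an arbitrary $\mathcal{B} = (B_0, \dots, B_\nu)$. Then for each $0 \le \kappa \le \nu$ there exists
$\gamma_\kappa \in {}^*[0,1]$ such that we can write

\[ B_\kappa = \gamma_\kappa \binom{\nu}{\kappa}^{-1}. \]

Note that since

\[ \sum_{\kappa=0}^{\nu} \binom{\nu}{\kappa} B_\kappa = 1 \]

we must have

\[ \sum_{\kappa=0}^{\nu} \gamma_\kappa = 1. \]

Then using (6) we see that each summand in $C_j$ will be of the form

\[ \gamma_\kappa \cdot \binom{\nu-q}{\kappa-j} \binom{\nu}{\kappa}^{-1}, \]

thus $^\circ C_j$ will become

\[ {}^\circ C_j = {}^\circ\!\left( \sum_{\kappa=j}^{\nu-q+j} \gamma_\kappa \binom{\nu-q}{\kappa-j} \binom{\nu}{\kappa}^{-1} \right). \]

Since we are only interested in the standard part, we can add the finitely many summands
for $\kappa = 0, \dots, j-1, \nu-q+j+1, \dots, \nu$ without changing $^\circ C_j$ (assuming that $0 < j < q$),
as we have

\[ {}^\circ\!\left( \sum_{\kappa=0}^{\nu} \gamma_\kappa \binom{\nu-q}{\kappa-j} \binom{\nu}{\kappa}^{-1} \right)
   = {}^\circ\!\left( \sum_{\kappa=0}^{j-1} \gamma_\kappa \binom{\nu-q}{\kappa-j} \binom{\nu}{\kappa}^{-1} \right)
   + {}^\circ\!\left( \sum_{\kappa=j}^{\nu-q+j} \gamma_\kappa \binom{\nu-q}{\kappa-j} \binom{\nu}{\kappa}^{-1} \right)
   + {}^\circ\!\left( \sum_{\kappa=\nu-q+j+1}^{\nu} \gamma_\kappa \binom{\nu-q}{\kappa-j} \binom{\nu}{\kappa}^{-1} \right)
   = 0 + {}^\circ C_j + 0, \]

because for $\kappa \in \{0, \dots, j-1, \nu-q+j+1, \dots, \nu\}$, either $^\circ(\kappa/\nu) = 0$ or $^\circ(1 - \kappa/\nu) = 0$, so
the first and last sum vanish as each consists of finitely many terms. Note that in case
$j = 0, q$, either the first or the second summand is empty, and therefore we can apply
the same argument for $j = 0, q$ as well, giving

\[ {}^\circ C_j = {}^\circ\!\left( \sum_{\kappa=0}^{\nu} \gamma_\kappa \binom{\nu-q}{\kappa-j} \binom{\nu}{\kappa}^{-1} \right) \qquad (9) \]

for $j \in \{0, \dots, q\}$.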



Now let N — {0, . . . , z/} and (in *V of course) let p be the Loeb counting measure on N 
(see example (1), section 2 in Then we can write ([9]) as 



i 

v — q\ [v 



° c ' = J*"\*-j)U Mk) - (10) 

Let p! be the discrete measure on *[0, 1] which for K G iV gives the point k/v measure 
7 K . Then we get 

v — q\ I v\ I I v — q \ I v 



Now let p be the measure in V on [0, 1] which for a Borel subset A of [0, 1] gives 

p(A) = V(M). (12) 
By well known results from Loeb Measure Theory, see for example [1],

7 w to)-w^«-/ M '(to)-ur)^- 

Combining (9), (10), (11), (13) now gives that

\[ {}^\circ C_j = \int_{[0,1]} x^{j} \cdot (1-x)^{q-j} \, d\rho(x). \qquad (14) \]

We obtain a $\vec{c} \in \mathbb{D}_{2^q}$ by letting

\[ \vec{c} = ({}^\circ C_0, {}^\circ C_1, \dots, {}^\circ C_1, \dots, {}^\circ C_{q-1}, \dots, {}^\circ C_{q-1}, {}^\circ C_q). \]

As we can marginalize $\vec{b}$ in the above way to any $r \in \mathbb{N}$, we obtain that given a family of
functions $\{ w_{\vec{d}_r} \mid \vec{d}_r \in \mathbb{D}_{2^r} \}$ such that each $\vec{d}_r$ is obtained by marginalizing some $\vec{b} \in {}^*\mathbb{D}_{2^\nu}$
and therefore satisfies (4), this family satisfies Unary Language Invariance.

For the converse it is straightforward to check that any $w_{\vec{c}}$ for which all the $c_i$ in $\vec{c}$ are of
the form (14) does satisfy ULi, the required family member on $L_r$ being obtained simply
by changing q to r with the same measure $\rho$. ⊣



However, as the following example will show, the probability functions of the form $w_{\vec{c}}$
satisfying ULi with IP are not the building blocks that generate all probability functions
satisfying ULi:

Example 6. Let $c_0^{L_2}$ be the probability function on $L_2$ given by

\[ c_0^{L_2} = 4^{-1} \left( w_{\langle 1,0,0,0 \rangle} + w_{\langle 0,1,0,0 \rangle} + w_{\langle 0,0,1,0 \rangle} + w_{\langle 0,0,0,1 \rangle} \right). \]

Then $c_0^{L_2}$ satisfies ULi as it is a member of Carnap's Continuum of Inductive Methods
(see e.g. [E]). However, both (0, 1, 0, 0) and (0, 0, 1, 0) are not of the form (jlj), and thus 
$c_0^{L_2}$ shows that we cannot have a Representation Theorem for w satisfying ULi of the
form

\[ w = \int_{\mathbb{D}_{2^q}} w_{\vec{x}} \, d\mu(\vec{x}) \]

with $\mu$ giving all weight to $\vec{c}$ of the form (4).
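The point of Example 6 can be checked directly on state descriptions. The following sketch assumes the four atoms of $L_2$ are indexed 0 to 3 in the order $P_1 \wedge P_2$, $P_1 \wedge \neg P_2$, $\neg P_1 \wedge P_2$, $\neg P_1 \wedge \neg P_2$, so that swapping the two predicates swaps atoms 1 and 2; the names `UNIT`, `c0` and `SWAP` are illustrative.

```python
from fractions import Fraction
from itertools import product

def w(c, hs):
    """w_c on the state description with atom indices hs: the product of the c_h."""
    out = Fraction(1)
    for h in hs:
        out *= c[h]
    return out

# The four unit vectors <1,0,0,0>, ..., <0,0,0,1>.
UNIT = [tuple(Fraction(int(i == k)) for i in range(4)) for k in range(4)]

def c0(hs):
    """Equal-weight average of the four w_c with c a unit vector."""
    return sum(w(c, hs) for c in UNIT) / 4

SWAP = {0: 0, 1: 2, 2: 1, 3: 3}  # atom permutation induced by swapping P_1 and P_2

# The average is invariant under the swap on every state description of length 3 ...
invariant = all(c0(hs) == c0([SWAP[h] for h in hs])
                for hs in product(range(4), repeat=3))
print(invariant)
# ... while a single component such as w_<0,1,0,0> is not.
print(w(UNIT[1], [1, 1]), w(UNIT[1], [2, 2]))
```

So the average satisfies Px although two of its four components individually do not.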



4 The Representation Theorem for w satisfying ULi 

In the previous section, we used a probability function satisfying Px + IP on the infinite
language $L_\nu$ to construct a language invariant family by marginalizing to each finite
level.

In this section we shall instead derive a representation theorem for just ULi by using
an arbitrary state description $\Upsilon$ of $L_\nu$ to construct a probability function satisfying Px
by averaging over all permutations of predicates, similarly to the definition of $c_0^{L_2}$ in
Example 6.

Let $\Upsilon(P_1, \dots, P_\nu, a_1, \dots, a_\nu)$ be the state description of $L_\nu$ given by

\[ \Upsilon(P_1, \dots, P_\nu, a_1, \dots, a_\nu) = \bigwedge_{i=1}^{\nu} \bigwedge_{j=1}^{\nu} P_i^{\epsilon_{i,j}}(a_j). \]

Then we can represent $\Upsilon$ by the $\nu \times \nu$ matrix

\[ \begin{pmatrix}
\epsilon_{1,1} & \epsilon_{1,2} & \cdots & \epsilon_{1,\nu} \\
\epsilon_{2,1} & \epsilon_{2,2} & \cdots & \epsilon_{2,\nu} \\
\vdots & \vdots & \ddots & \vdots \\
\epsilon_{\nu,1} & \epsilon_{\nu,2} & \cdots & \epsilon_{\nu,\nu}
\end{pmatrix} \qquad (15) \]

Now consider the $q \times \nu$ matrix $\Psi$ where the j'th row of $\Psi$ is the $i_j$'th row of $\Upsilon$, for
some $i_1, \dots, i_q \in \{1, \dots, \nu\}$, not necessarily distinct. Then we can similarly think of $\Psi$
as a state description $\Psi(a_1, \dots, a_\nu)$ of $L_q$. So each column of $\Psi$ represents an atom of
$L_q$, and we obtain $\vec{c} \in {}^*\mathbb{D}_{2^q}$ by letting

\[ c_i = \frac{|\{ j \mid \Psi \models \alpha_i(a_j) \}|}{\nu}. \]

We thus obtain for each $\langle i_1, \dots, i_q \rangle$ with $1 \le i_1, \dots, i_q \le \nu$ some $w_{\vec{c}}$ for $\vec{c} \in {}^*\mathbb{D}_{2^q}$, which
we shall denote by $w^{\Upsilon}_{\langle i_1, \dots, i_q \rangle}$.

We can now define the functions that we will then use to prove the representation 
theorem for general ULi functions. 

Definition 7:

Let $\Upsilon(P_1, \dots, P_\nu, a_1, \dots, a_\nu)$ be a state description of $L_\nu$ for $\nu$ distinct constants. Let
$L = L_q$ for some finite q. For $i_1, \dots, i_q \in \{1, \dots, \nu\}$, not necessarily distinct, let $w^{\Upsilon}_{\langle i_1, \dots, i_q \rangle}$
be given as above.

Define the function $V^{L}_{\Upsilon}$ on SL by

\[ V^{L}_{\Upsilon} = \sum_{e : \{1, \dots, q\} \to \{1, \dots, \nu\}} \frac{1}{\nu^{q}} \cdot w^{\Upsilon}_{\langle e(1), \dots, e(q) \rangle}. \]

Instead of just marginalizing to the first q rows, as we did in the case of $w_{\vec{c}}$, $V^{L}_{\Upsilon}$ now also
averages over all permutations of the predicates. One can think of this as picking q rows
from the matrix representing $\Upsilon$ with replacement to obtain the predicates $P_1, \dots, P_q$ of
$L_q$.
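The construction of Definition 7 can be imitated with an ordinary finite matrix in place of the nonstandard one. The sketch below is such a finite analogue, which is an assumption made for illustration only: $\nu$ is a small standard number, atoms are sign tuples (1 positive, 0 negated), and the names `w_rows`, `V` and the sample matrix `M` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def w_rows(matrix, rows, theta):
    """w for the chosen rows: for each constant of theta, the frequency among the
    columns of the required atom, multiplied together.

    matrix: nu rows (one per predicate) of nu bits (one per constant).
    rows:   tuple of q row indices, repetitions allowed.
    theta:  state description, a list of atoms given as q-tuples of bits.
    """
    nu = len(matrix[0])
    columns = [tuple(matrix[i][j] for i in rows) for j in range(nu)]
    out = Fraction(1)
    for atom in theta:
        out *= Fraction(columns.count(atom), nu)
    return out

def V(matrix, q, theta):
    """Average of w_rows over all nu^q maps e: {1..q} -> {1..nu}."""
    nu = len(matrix)
    total = sum(w_rows(matrix, e, theta) for e in product(range(nu), repeat=q))
    return total / nu**q

M = [[1, 0, 1, 1],
     [0, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 1]]

theta = [(1, 0), (1, 1), (0, 1)]
swapped = [(0, 1), (1, 1), (1, 0)]  # theta with P_1 and P_2 exchanged
print(V(M, 2, theta), V(M, 2, swapped))
```

Because the average runs over all maps e, exchanging the predicates only reorders the summands, so the two printed values coincide; the same sketch can be used to confirm that marginalizing from two predicates to one reproduces the value computed directly on the smaller language.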

Before our next result we need to recall another principle, see [S], [T2] . 
The Weak Irrelevance Principle, WIP 

A probability function w on SL satisfies Weak Irrelevance if whenever $\vartheta, \varphi \in QFSL$ have
no constants nor predicates in common then

\[ w(\vartheta \wedge \varphi) = w(\vartheta) \cdot w(\varphi). \]



Theorem 8. Let $\Upsilon(P_1, \dots, P_\nu, a_1, \dots, a_\nu)$ be a state description of $L_\nu$ and let $L = L_q$.
Then the function $^\circ V^{L}_{\Upsilon}$ is (can be extended to) a probability function on SL satisfying
ULi + WIP.

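Proof: From the definition of $V^{L}_{\Upsilon}$ it is obvious that $^\circ V^{L}_{\Upsilon}$ is a probability function
satisfying Ex.

For Px, let $\sigma$ be a permutation of the predicates of L. Then we obtain

\[ {}^\circ V^{L}_{\Upsilon}(\sigma\Theta)
   = {}^\circ\!\left( \sum_{e : \{1, \dots, q\} \to \{1, \dots, \nu\}} \frac{1}{\nu^{q}} \cdot w^{\Upsilon}_{\langle e(1), \dots, e(q) \rangle}(\sigma\Theta) \right)
   = {}^\circ\!\left( \sum_{e : \{1, \dots, q\} \to \{1, \dots, \nu\}} \frac{1}{\nu^{q}} \cdot w^{\Upsilon}_{\langle e \circ \sigma^{-1}(1), \dots, e \circ \sigma^{-1}(q) \rangle}(\Theta) \right), \]

since $\sigma$ permutes the predicates of L,

\[ = {}^\circ\!\left( \sum_{e' : \{1, \dots, q\} \to \{1, \dots, \nu\}} \frac{1}{\nu^{q}} \cdot w^{\Upsilon}_{\langle e'(1), \dots, e'(q) \rangle}(\Theta) \right)
   = {}^\circ V^{L}_{\Upsilon}(\Theta), \]

where $e' = e \circ \sigma^{-1}$ ranges over all maps from {1, ..., q} to {1, ..., $\nu$} as e does.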



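To show that ULi holds, notice that for $\Theta(a_1, \dots, a_n)$ the state description

\[ \Theta(a_1, \dots, a_n) = \bigwedge_{j=1}^{n} \alpha_{h_j}(a_j), \]

we obtain in $L_{q+1}$

\[ \Theta(a_1, \dots, a_n) \equiv \bigvee_{\epsilon_1, \dots, \epsilon_n \in \{0,1\}} \bigwedge_{j=1}^{n} \alpha_{h_j}^{\epsilon_j}(a_j), \]

where

\[ \alpha_{h_j}^{\epsilon_j}(x) = \alpha_{h_j}(x) \wedge P_{q+1}^{\epsilon_j}(x). \]

We obtain

\[ {}^\circ V^{L_{q+1}}_{\Upsilon}(\Theta)
   = {}^\circ V^{L_{q+1}}_{\Upsilon}\!\left( \bigvee_{\epsilon_1, \dots, \epsilon_n \in \{0,1\}} \bigwedge_{j=1}^{n} \alpha_{h_j}^{\epsilon_j} \right)
   = {}^\circ\!\left( \sum_{e : \{1, \dots, q+1\} \to \{1, \dots, \nu\}} \frac{1}{\nu^{q+1}} \cdot w^{\Upsilon}_{\langle e(1), \dots, e(q+1) \rangle}\!\left( \bigvee_{\epsilon_1, \dots, \epsilon_n \in \{0,1\}} \bigwedge_{j=1}^{n} \alpha_{h_j}^{\epsilon_j} \right) \right) \]

\[ = {}^\circ\!\left( \sum_{e' : \{1, \dots, q\} \to \{1, \dots, \nu\}} \frac{1}{\nu^{q}} \sum_{f : \{1\} \to \{1, \dots, \nu\}} \frac{1}{\nu} \cdot w^{\Upsilon}_{\langle e'(1), \dots, e'(q), f(1) \rangle}\!\left( \bigvee_{\epsilon_1, \dots, \epsilon_n \in \{0,1\}} \bigwedge_{j=1}^{n} \alpha_{h_j}^{\epsilon_j} \right) \right), \]

where

\[ e(i) = \begin{cases} e'(i) & \text{if } i \in \{1, \dots, q\}, \\ f(1) & \text{if } i = q+1. \end{cases} \]

It now remains to show that

\[ \sum_{f : \{1\} \to \{1, \dots, \nu\}} \frac{1}{\nu} \cdot w^{\Upsilon}_{\langle e'(1), \dots, e'(q), f(1) \rangle}\!\left( \bigvee_{\epsilon_1, \dots, \epsilon_n \in \{0,1\}} \bigwedge_{j=1}^{n} \alpha_{h_j}^{\epsilon_j} \right) = w^{\Upsilon}_{\langle e'(1), \dots, e'(q) \rangle}(\Theta) \qquad (16) \]

for arbitrary $e' : \{1, \dots, q\} \to \{1, \dots, \nu\}$. There are $\vec{c} \in {}^*\mathbb{D}_{2^q}$, $\vec{d} \in {}^*\mathbb{D}_{2^{q+1}}$ such that

\[ w^{\Upsilon}_{\langle e'(1), \dots, e'(q) \rangle} = w_{\vec{c}}, \qquad w^{\Upsilon}_{\langle e'(1), \dots, e'(q), f(1) \rangle} = w_{\vec{d}}. \]

Given $\beta_j$ an atom of $L_{q+1}$, there is a unique atom $\alpha_i$ of $L_q$ and a unique $\epsilon \in \{0,1\}$ such
that

\[ \beta_j = \alpha_i^{\epsilon}. \]

Thus, we can unambiguously write $d_j = c_i^{\epsilon}$ for these $i, \epsilon$. We then obtain

\[ w_{\vec{d}}\!\left( \bigvee_{\epsilon_1, \dots, \epsilon_n \in \{0,1\}} \bigwedge_{j=1}^{n} \alpha_{h_j}^{\epsilon_j} \right)
   = \sum_{\epsilon_1, \dots, \epsilon_n \in \{0,1\}} \prod_{j=1}^{n} c_{h_j}^{\epsilon_j}
   = \prod_{j=1}^{n} \left( c_{h_j}^{0} + c_{h_j}^{1} \right). \qquad (17) \]

Since by picking row f(1) as the q+1'st row we partition the occurrences of the atom $\alpha_i$
of $L_q$ obtained by picking rows $e'(1), \dots, e'(q)$ into occurrences of the atoms $\alpha_i^{1}$ and $\alpha_i^{0}$ of
$L_{q+1}$, and this is the only way in which we obtain these atoms, we must have $c_i^{0} + c_i^{1} = c_i$
for each $i \in \{1, \dots, 2^q\}$. Thus (17) gives

\[ \prod_{j=1}^{n} \left( c_{h_j}^{0} + c_{h_j}^{1} \right) = \prod_{j=1}^{n} c_{h_j} = w^{\Upsilon}_{\langle e'(1), \dots, e'(q) \rangle}(\Theta). \]

The equation (16) now follows.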

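It remains to show that Weak Irrelevance holds for $^\circ V^{L}_{\Upsilon}$. Let $\vartheta(a_1, \dots, a_m)$,
$\varphi(a_{m+1}, \dots, a_{m+n})$ be state descriptions of L having no constant or predicates in
common. We can assume that $\vartheta \in QFSL^1$, $\varphi \in QFSL^2$, where $L^1 \cap L^2 = \emptyset$ and $L^1 \cup L^2 = L$.
Let $\alpha_i$ range over the atoms of $L^1$, $\beta_j$ over the atoms of $L^2$. Then we obtain in $L^1$ and
$L^2$, respectively,

\[ \vartheta(a_1, \dots, a_m) = \bigwedge_{i=1}^{m} \alpha_{h_i}(a_i), \qquad
   \varphi(a_{m+1}, \dots, a_{m+n}) = \bigwedge_{j=1}^{n} \beta_{g_j}(a_{m+j}). \]

Suppose that $L^1 = \{P_1, \dots, P_p\}$, $L^2 = \{P_{p+1}, \dots, P_{p+r}\}$. Then we obtain in L

\[ \vartheta(a_1, \dots, a_m) \equiv \bigvee_{1 \le s_1, \dots, s_m \le 2^r} \bigwedge_{i=1}^{m} \alpha_{h_i}(a_i) \wedge \beta_{s_i}(a_i), \]

\[ \varphi(a_{m+1}, \dots, a_{m+n}) \equiv \bigvee_{1 \le t_1, \dots, t_n \le 2^p} \bigwedge_{j=1}^{n} \alpha_{t_j}(a_{m+j}) \wedge \beta_{g_j}(a_{m+j}), \]

and by ULi for $^\circ V_{\Upsilon}$,

\[ {}^\circ V^{L^1}_{\Upsilon}(\vartheta) = {}^\circ V^{L}_{\Upsilon}\!\left( \bigvee_{1 \le s_1, \dots, s_m \le 2^r} \bigwedge_{i=1}^{m} \alpha_{h_i} \wedge \beta_{s_i} \right), \qquad (18) \]

\[ {}^\circ V^{L^2}_{\Upsilon}(\varphi) = {}^\circ V^{L}_{\Upsilon}\!\left( \bigvee_{1 \le t_1, \dots, t_n \le 2^p} \bigwedge_{j=1}^{n} \alpha_{t_j} \wedge \beta_{g_j} \right). \qquad (19) \]

Now for $\vartheta \wedge \varphi$ we obtain in L

\[ {}^\circ V^{L}_{\Upsilon}(\vartheta \wedge \varphi)
   = {}^\circ V^{L}_{\Upsilon}\!\left( \bigvee_{1 \le s_1, \dots, s_m \le 2^r} \ \bigvee_{1 \le t_1, \dots, t_n \le 2^p} \left( \bigwedge_{i=1}^{m} \alpha_{h_i} \wedge \beta_{s_i} \right) \wedge \left( \bigwedge_{j=1}^{n} \alpha_{t_j} \wedge \beta_{g_j} \right) \right) \]

\[ = \sum_{1 \le s_1, \dots, s_m \le 2^r} \ \sum_{1 \le t_1, \dots, t_n \le 2^p} {}^\circ\!\left( \sum_{e : \{1, \dots, q\} \to \{1, \dots, \nu\}} \frac{1}{\nu^{q}} \cdot w^{\Upsilon}_{\langle e(1), \dots, e(q) \rangle}\!\left( \left( \bigwedge_{i=1}^{m} \alpha_{h_i} \wedge \beta_{s_i} \right) \wedge \left( \bigwedge_{j=1}^{n} \alpha_{t_j} \wedge \beta_{g_j} \right) \right) \right) \]

\[ = \sum_{1 \le s_1, \dots, s_m \le 2^r} \ \sum_{1 \le t_1, \dots, t_n \le 2^p} {}^\circ\!\left( \sum_{e : \{1, \dots, q\} \to \{1, \dots, \nu\}} \frac{1}{\nu^{q}} \cdot w^{\Upsilon}_{\langle e(1), \dots, e(q) \rangle}\!\left( \bigwedge_{i=1}^{m} \alpha_{h_i} \wedge \beta_{s_i} \right) \cdot w^{\Upsilon}_{\langle e(1), \dots, e(q) \rangle}\!\left( \bigwedge_{j=1}^{n} \alpha_{t_j} \wedge \beta_{g_j} \right) \right) \]

by IP for $w^{\Upsilon}_{\langle e(1), \dots, e(q) \rangle}$. After summing over the $s_i$, the first factor is the value of
$w^{\Upsilon}_{\langle e(1), \dots, e(q) \rangle}$ on $\vartheta$ and depends only on the rows $e(1), \dots, e(p)$; after summing over the $t_j$,
the second factor is its value on $\varphi$ and depends only on the rows $e(p+1), \dots, e(p+r)$. The
average over e therefore factorizes, giving

\[ = \left( \sum_{1 \le s_1, \dots, s_m \le 2^r} {}^\circ\!\left( \sum_{e : \{1, \dots, q\} \to \{1, \dots, \nu\}} \frac{1}{\nu^{q}} \cdot w^{\Upsilon}_{\langle e(1), \dots, e(q) \rangle}\!\left( \bigwedge_{i=1}^{m} \alpha_{h_i} \wedge \beta_{s_i} \right) \right) \right)
   \cdot \left( \sum_{1 \le t_1, \dots, t_n \le 2^p} {}^\circ\!\left( \sum_{e : \{1, \dots, q\} \to \{1, \dots, \nu\}} \frac{1}{\nu^{q}} \cdot w^{\Upsilon}_{\langle e(1), \dots, e(q) \rangle}\!\left( \bigwedge_{j=1}^{n} \alpha_{t_j} \wedge \beta_{g_j} \right) \right) \right) \]

\[ = \left( \sum_{1 \le s_1, \dots, s_m \le 2^r} {}^\circ V^{L}_{\Upsilon}\!\left( \bigwedge_{i=1}^{m} \alpha_{h_i} \wedge \beta_{s_i} \right) \right)
   \cdot \left( \sum_{1 \le t_1, \dots, t_n \le 2^p} {}^\circ V^{L}_{\Upsilon}\!\left( \bigwedge_{j=1}^{n} \alpha_{t_j} \wedge \beta_{g_j} \right) \right)
   = {}^\circ V^{L}_{\Upsilon}(\vartheta) \cdot {}^\circ V^{L}_{\Upsilon}(\varphi), \]

by (18) and (19). ⊣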

Theorem 9. Let w be a probability function on $L = L_q$. Then w satisfies ULi if and
only if there exists some normalized $\sigma$-additive measure $\rho$ such that

\[ w = \int {}^\circ V^{L}_{\Upsilon} \, d\rho(\Upsilon). \qquad (20) \]

Proof: By Theorem 8 it is straightforward to see that any w in the form (20) satisfies
ULi, as it is a convex combination of ULi functions.

For the other direction, suppose w satisfies ULi. Then there is an extension $w^{L_\nu}$ of w
to $L_\nu$ and we obtain for $\Theta(a_1, \dots, a_n)$ a state description of L,

\[ w(\Theta) = \sum_{\Phi \models \Theta} w^{L_\nu}(\Phi), \qquad (21) \]

where $\Phi$ ranges over the state descriptions of $L_\nu$. For a state description
$\Upsilon(P_1, \dots, P_\nu, a_1, \dots, a_\nu)$, let

\[ \mathcal{T} = \{ \Upsilon(P_{\sigma(1)}, \dots, P_{\sigma(\nu)}, a_{\tau(1)}, \dots, a_{\tau(\nu)}) \mid \sigma, \tau \text{ are permutations of } \{1, \dots, \nu\} \}. \]

Note that the sets $\mathcal{T}$ partition the set of state descriptions of $L_\nu$. We can now write (21)
as

\[ w(\Theta) = \sum_{\mathcal{T}} \ \sum_{\Phi \in \mathcal{T},\ \Phi \models \Theta} w^{L_\nu}(\Phi)
   = \sum_{\mathcal{T}} \frac{|\{ \Phi \in \mathcal{T} \mid \Phi \models \Theta \}|}{|\mathcal{T}|} \cdot w^{L_\nu}\!\left( \bigvee \mathcal{T} \right), \]

as $w^{L_\nu}$ is clearly constant on $\mathcal{T}$ since it satisfies Px (and Ex).
Now the ratio

\[ \frac{|\{ \Phi \in \mathcal{T} \mid \Phi \models \Theta \}|}{|\mathcal{T}|} \]

is equal to the probability that by randomly picking distinct predicates $P_{i_1}, \dots, P_{i_q}$ and
constants $a_{j_1}, \dots, a_{j_n}$, we have that

\[ \Upsilon \models \sigma\Theta(a_{j_1}, \dots, a_{j_n}), \]

where $\sigma$ is (an initial segment of) the permutation of predicates of $L_\nu$ with $\sigma(k) = i_k$
for $k \in \{1, \dots, q\}$.

Note that with our definition of $V^{L}_{\Upsilon}$, we allow the same row to be picked multiple
times, so not all picks of rows represent a permutation of the predicates. Thus the
difference between the probabilities given by $V^{L}_{\Upsilon}$ and the above ratio is the difference
between picking rows of $\Upsilon$ with and without replacement. However, since the probability
of picking the same row twice is infinitesimal, it will disappear when taking standard
parts.
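The size of the discrepancy between sampling with and without replacement is easy to quantify. As a rough finite illustration (with a large standard number standing in for the nonstandard $\nu$, which is an assumption of this sketch), the probability that q uniformly drawn rows are pairwise distinct is $\prod_{t=0}^{q-1} (\nu - t)/\nu$; the function name below is hypothetical.

```python
from fractions import Fraction

def prob_all_distinct(nu, q):
    """Probability that q rows drawn uniformly with replacement from nu rows are distinct."""
    p = Fraction(1)
    for t in range(q):
        p *= Fraction(nu - t, nu)
    return p

for nu in (10, 10**3, 10**6):
    print(nu, float(1 - prob_all_distinct(nu, 3)))
```

For fixed q the probability of a repeated row is at most $q(q-1)/(2\nu)$, which is infinitesimal once $\nu$ is nonstandard.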

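Thus we obtain

\[ {}^\circ\!\left( \frac{|\{ \Phi \in \mathcal{T} \mid \Phi \models \Theta \}|}{|\mathcal{T}|} \right) = {}^\circ V^{L}_{\Upsilon}(\Theta). \]

Now taking $\mu$ to be the measure on the $\mathcal{T}$ given by $w^{L_\nu}$, we obtain

\[ \sum_{\mathcal{T}} \frac{|\{ \Phi \in \mathcal{T} \mid \Phi \models \Theta \}|}{|\mathcal{T}|} \cdot w^{L_\nu}\!\left( \bigvee \mathcal{T} \right)
   = \int \frac{|\{ \Phi \in \mathcal{T} \mid \Phi \models \Theta \}|}{|\mathcal{T}|} \, d\mu(\mathcal{T}). \]

Taking standard parts, we obtain

\[ w(\Theta) = \int {}^\circ V^{L}_{\Upsilon}(\Theta) \, d\rho(\Upsilon), \]

where $\rho$ is the Loeb measure given by the nonstandard measure $\mu$. ⊣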

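Since $^\circ V^{L}_{\Upsilon}$ satisfies WIP we obtain the following theorem.

Theorem 10. The $^\circ V^{L}_{\Upsilon}$ are the only functions satisfying ULi with WIP.

Proof: We follow essentially the proof for the analogous theorem for Atom
Exchangeability, given in [13].

Let w be a probability function satisfying ULi with WIP. Let $\vartheta \in QFSL$. Extend w
to w' on some language L' large enough so that we can permute the predicates and
constants in $\vartheta$ to obtain $\vartheta'$ with no predicates nor constants in common with $\vartheta$. We can
achieve this by picking w' on L' in the same ULi family as w, giving $w' \restriction SL = w$ and
guaranteeing WIP for w'. By Px for w' we then have $w'(\vartheta) = w'(\vartheta')$. Now we clearly
obtain

\[ 0 = 2\left( w'(\vartheta \wedge \vartheta') - w'(\vartheta) \cdot w'(\vartheta') \right) \]

\[ = \int {}^\circ V^{L'}_{\Psi}(\vartheta \wedge \vartheta') \, d\mu(\Psi) - 2 \int {}^\circ V^{L'}_{\Psi}(\vartheta) \, d\mu(\Psi) \cdot \int {}^\circ V^{L'}_{\Phi}(\vartheta') \, d\mu(\Phi) + \int {}^\circ V^{L'}_{\Phi}(\vartheta \wedge \vartheta') \, d\mu(\Phi) \]

\[ = \int\!\!\int \left( {}^\circ V^{L'}_{\Psi}(\vartheta)^2 - 2 \cdot {}^\circ V^{L'}_{\Psi}(\vartheta) \cdot {}^\circ V^{L'}_{\Phi}(\vartheta) + {}^\circ V^{L'}_{\Phi}(\vartheta)^2 \right) d\mu(\Psi) \, d\mu(\Phi) \]

\[ = \int\!\!\int \left( {}^\circ V^{L'}_{\Psi}(\vartheta) - {}^\circ V^{L'}_{\Phi}(\vartheta) \right)^2 d\mu(\Psi) \, d\mu(\Phi), \]

using the Representation Theorem. Certainly, since the function under the integral is
non-negative, there must be a measure 1 set such that $^\circ V^{L'}_{\Psi}(\vartheta)$ is constant on this set for
each $\vartheta \in QFSL$, giving $w' = {}^\circ V^{L'}_{\Psi}$ for any $\Psi$ in this set. Since $w' \restriction SL = w$,
marginalizing w' to L yields $w = {}^\circ V^{L}_{\Psi}$, as required. ⊣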



5 A General Representation Theorem 

In the case of Atom Exchangeability (Ax) (see e.g. [12, chapter 33]), we have a theorem
stating that each w satisfying Ax can be represented as a difference of scaled ULi
functions with Ax. In this section, we will prove the analogous version for Px. For the
remainder of this section we assume that $L = L_q$ for some $q \in \mathbb{N}$.

Definition 11:

Let $\vec{c} \in \mathbb{D}_{2^q}$. Let $\Sigma$ be the set of all permutations of atoms of L that are induced by Px.
Define the probability function $y_{\vec{c}}$ on QFSL by

\[ y_{\vec{c}}(\Theta(a_1, \dots, a_n)) = \frac{1}{|\Sigma|} \sum_{\sigma \in \Sigma} w_{\sigma\vec{c}}(\Theta(a_1, \dots, a_n)) \]

for state descriptions $\Theta(a_1, \dots, a_n)$ of L.

Note that by definition, $y_{\vec{c}}$ satisfies Px. By a straightforward argument we obtain the
following variation on de Finetti's Theorem:

Theorem 12. Let w be a probability function on SL satisfying Px. Then there exists a
normalized, $\sigma$-additive measure $\mu$ on the Borel sets of $\mathbb{D}_{2^q}$ such that

\[ w\!\left( \bigwedge_{j=1}^{n} \alpha_{h_j}(a_j) \right) = \int_{\mathbb{D}_{2^q}} y_{\vec{c}}\!\left( \bigwedge_{j=1}^{n} \alpha_{h_j}(a_j) \right) d\mu(\vec{c}). \qquad (22) \]

Conversely, given such a measure $\mu$, the function w defined by (22) satisfies Px.
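The building blocks of Definition 11 can be computed explicitly for small q. The sketch below assumes atoms are encoded as sign tuples listed in the order produced by `itertools.product((1, 0), repeat=q)`, and that a permuted vector is obtained by re-indexing the entries of $\vec{c}$ along the induced atom permutation; since the average runs over the whole set of induced permutations, the direction of re-indexing does not affect the result. All names are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

def induced_perms(q):
    """Atom permutations (as index tuples) induced by permutations of the q predicates."""
    A = list(product((1, 0), repeat=q))
    index = {a: i for i, a in enumerate(A)}
    perms = set()
    for s in permutations(range(q)):
        perms.add(tuple(index[tuple(a[s[i]] for i in range(q))] for a in A))
    return sorted(perms)

def w(c, hs):
    """w_c on the state description with atom indices hs."""
    out = Fraction(1)
    for h in hs:
        out *= c[h]
    return out

def y(c, hs):
    """Average of w over all vectors obtained from c by an induced atom permutation."""
    q = len(c).bit_length() - 1
    P = induced_perms(q)
    return sum(w([c[p[i]] for i in range(len(c))], hs) for p in P) / len(P)

c = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8))  # sample point, q = 2
print(w(c, [1, 1]), w(c, [2, 2]))  # differ: w_c itself does not satisfy Px
print(y(c, [1, 1]), y(c, [2, 2]))  # agree: the average does
```

For q = 2 the only non-trivial induced permutation swaps the two atoms with exactly one negated predicate, so the average has just two terms.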



The key to obtaining the desired General Representation Theorem will therefore involve
finding a uniform representation of the building blocks $y_{\vec c}$ in terms of a difference of ULi
functions. The $V_T$ functions used for this proof will have a specific characterization
that deserves a slightly different notation. Since at this point, we will be working in
the usual standard universe again, we will drop the standard part symbol ${}^\circ$ from the
notation and assume that all $V_T$ from now on are given in their standard form.

Recalling the definition of $V_T$, note that for fixed $e : \{1, \dots, q\} \to \{1, \dots, \nu\}$, the function
$w^{T}_{\langle e(1), \dots, e(q)\rangle}$ is given by the $q \times \nu$ matrix with the $i$'th row identical to the $e(i)$'th row of
$T$. Also, since with $w^{T}_{\langle e(1), \dots, e(q)\rangle}$ we also have all the $w^{T}_{\langle \sigma(e(1)), \dots, \sigma(e(q))\rangle}$ for $\sigma$ ranging over
the permutations of the predicates of $L$ occurring in $V_T$, we see that this function is a
convex combination of functions of the form $y_{\vec c}$.

We can now arrange $V_T$ to contain a copy of $y_{\vec c}$ for a given $\vec c \in \mathbb{D}_{2^q}$ as follows: Let $\Phi$ be
the state description represented by the matrix

\[
\alpha_1 \cdots \alpha_1 \;\; \alpha_2 \cdots \alpha_2 \;\; \cdots \;\; \alpha_{2^q} \cdots \alpha_{2^q}
\]

where $\alpha_i$ occurs $\lfloor c_i \cdot \nu \rfloor$ times. Now let $p_1, \dots, p_q > 0$ be such that $\sum_{i=1}^{q} p_i = 1$ and let
$T$ be the $\nu \times \nu$ matrix containing $\lfloor p_i \cdot \nu \rfloor$ copies of the $i$'th row of $\Phi$, for each $i$, and fill
the remaining rows with arbitrary copies of rows from $\Phi$. Then $V_T$ certainly contains a
copy of $y_{\vec c}$.

With this in mind, we can modify the notation of $V_T$ to ${}_{\vec p}V_T$
for $\vec p = (p_1, \dots, p_q)$ to indicate that $T$ contains only $q$ distinct rows, occurring with the
frequency given by $\vec p$. We will write ${}_{\vec p}V_{T(\vec c)}$ to indicate that $T$ arises from $\vec c \in \mathbb{D}_{2^q}$ in this
manner.

We can represent ${}_{\vec p}V_{T(\vec c)}$ in terms of functions of the form $y_{\vec e}$ as follows. Let
$K = \{\vec n \in \mathbb{N}^q \mid \sum_{i=1}^{q} n_i = q\}$, so $\vec n \in K$ represents the choices of picking rows
from $T$. Then we obtain the representation

\[
{}_{\vec p}V_{T(\vec c)} = \sum_{\vec n \in K} \prod_{i=1}^{q} p_i^{n_i}\,(n_1, \dots, n_q)!\; y_{\vec c_{\vec n}} \tag{23}
\]

where $\vec c_{\vec n}$ results from picking rows according to $\vec n$ and (as standard)

\[
(n_1, \dots, n_q)! = \frac{(n_1 + n_2 + \dots + n_q)!}{n_1!\, n_2! \cdots n_q!} = \binom{q}{n_1, \dots, n_q}.
\]

Note that we need this multinomial coefficient here since ${}_{\vec p}V_{T(\vec c)}$ is in fact a sum of functions $w_{\vec e}$,
and although each of the $w_{\vec e}$ occurring in a given $y_{\vec e}$ occurs, the normalizing constant exists only
implicitly in ${}_{\vec p}V_{T(\vec c)}$. With this notation in mind, we can prove the first step needed to
show the desired theorem.
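As a sanity check on the coefficients in (23), the multinomial theorem gives $\sum_{\vec n \in K} (n_1, \dots, n_q)! \prod_i p_i^{n_i} = (p_1 + \dots + p_q)^q = 1$, so the representation is a convex combination. The sketch below verifies this for one hypothetical choice of $\vec p$ with $q = 3$; the helper names are illustrative assumptions only.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def index_set_K(q):
    """All tuples n of q non-negative integers with n_1 + ... + n_q = q."""
    return [n for n in product(range(q + 1), repeat=q) if sum(n) == q]

def multinomial(n):
    """(n_1 + ... + n_q)! / (n_1! * ... * n_q!)."""
    out = factorial(sum(n))
    for k in n:
        out //= factorial(k)
    return out

def coefficients(p):
    """Coefficient of each term in (23): multinomial(n) * prod_i p_i^{n_i}."""
    result = {}
    for n in index_set_K(len(p)):
        term = Fraction(multinomial(n))
        for p_i, n_i in zip(p, n):
            term *= p_i ** n_i
        result[n] = term
    return result

# Hypothetical frequencies for q = 3 distinct rows.
p = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
coeffs = coefficients(p)

print(len(coeffs))           # number of elements of K
print(sum(coeffs.values()))  # total weight of the combination
print(coeffs[(1, 1, 1)])     # weight of the term that picks each distinct row once
```

The entry for $\vec n = (1, \dots, 1)$ is the one that matters in the proof of Lemma 13 below, since picking each distinct row exactly once recovers the rows of $\Phi$.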



Lemma 13. Let $\vec c \in \mathbb{D}_{2^q}$. Then there exist $\lambda > 0$ and probability functions $w_1, w_2$
satisfying ULi such that

\[
y_{\vec c} = (1 + \lambda)w_1 - \lambda w_2.
\]

Proof: Fix $\vec c \in \mathbb{D}_{2^q}$. As demonstrated in the discussion above, we can easily find $V_T$
with $y_{\vec c}$ occurring in it, amongst other instances of functions $y_{\vec e}$ with $\vec e \in \mathbb{D}_{2^q}$. Thus, the
problem reduces to finding a way to remove all of these other instances via ULi functions.

To this end, suppose that for each $\vec m \in K$ we have ${}_{\vec p_{\vec m}}V_{T(\vec c)}$ such that $T$ is the state
description obtained from $w_{\vec c}$ by the method discussed above. Then, since the representations
of the form (23) of these functions only differ in the coefficients of the $y_{\vec e}$ occurring
we obtain the equation



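\[
\bigl({}_{\vec p_{\vec m}}V_{T(\vec c)}\bigr)_{\vec m \in K} = A \cdot \bigl((n_1, \dots, n_q)!\; y_{\vec c_{\vec n}}\bigr)_{\vec n \in K} \tag{24}
\]

where $A$ is the $K \times K$ matrix with entry $(\vec m, \vec n)$ being $\prod_{k=1}^{q} p_{\vec m,k}^{\,n_k}$. It suffices now to show
that we can pick the $\vec p_{\vec m}$ such that $A$ is regular. For suppose this is the case. Then we
obtain from (24) the equation

\[
\bigl((n_1, \dots, n_q)!\; y_{\vec c_{\vec n}}\bigr)_{\vec n \in K} = A^{-1} \cdot \bigl({}_{\vec p_{\vec m}}V_{T(\vec c)}\bigr)_{\vec m \in K}. \tag{25}
\]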


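Then for $\vec n = (1, 1, \dots, 1)$ we obtain

\[
y_{\vec c} = \frac{1}{(n_1, \dots, n_q)!} \sum_{\vec m \in K} b_{\vec n, \vec m}\; {}_{\vec p_{\vec m}}V_{T(\vec c)} = \frac{1}{q!} \sum_{\vec m \in K} b_{\vec n, \vec m}\; {}_{\vec p_{\vec m}}V_{T(\vec c)},
\]

where $b_{\vec n, \vec m}$ denotes the $(\vec n, \vec m)$ entry of $A^{-1}$, and by collecting the functions with positive
coefficients in the linear combination on the right-hand side, we obtain constants
$\gamma, \lambda > 0$, independent of $\vec c$, such that³

\[
\frac{1}{q!} \sum_{\vec m \in K} b_{\vec n, \vec m}\; {}_{\vec p_{\vec m}}V_{T(\vec c)} = \gamma w_1 - \lambda w_2,
\]

with $w_1, w_2$ convex combinations of ULi functions. Since this gives the probability
function $y_{\vec c}$, we must have

\[
1 = y_{\vec c}(\top) = \gamma w_1(\top) - \lambda w_2(\top) = \gamma - \lambda,
\]

and thus $\gamma = 1 + \lambda$.

³ Note that we can safely assume $\lambda \neq 0$, since if $\lambda = 0$, then the $y_{\vec c}$ in question would already satisfy
ULi, and therefore already has the desired representation by the Representation Theorem for ULi.
We also trivially have $\gamma \neq 0$, since $y_{\vec c}$ is a probability function for any $\vec c \in \mathbb{D}_{2^q}$.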
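The step of collecting positive and negative coefficients can be made concrete. The sketch below is a minimal illustration under stated assumptions: the coefficients `coeffs` and the value tuples in `functions` are hypothetical stand-ins for the $b_{\vec n, \vec m}/q!$ and for the values of the ULi functions on a common finite list of sentences, and at least one coefficient of each sign is assumed.

```python
from fractions import Fraction

def split_signed_combination(coeffs, functions):
    """Rewrite sum_m b_m * f_m as gamma * w1 - lam * w2.

    w1 and w2 are convex combinations of the f_m with positive and negative
    coefficients respectively. Assumes both signs occur among the coefficients.
    """
    size = len(functions[0])
    gamma = sum(b for b in coeffs if b > 0)
    lam = -sum(b for b in coeffs if b < 0)
    w1 = tuple(
        sum(b * f[i] for b, f in zip(coeffs, functions) if b > 0) / gamma
        for i in range(size)
    )
    w2 = tuple(
        sum(-b * f[i] for b, f in zip(coeffs, functions) if b < 0) / lam
        for i in range(size)
    )
    return gamma, lam, w1, w2

# Hypothetical signed coefficients (summing to 1) and probability vectors.
coeffs = [Fraction(3, 2), Fraction(3, 4), Fraction(-5, 4)]
functions = [
    (Fraction(1, 2), Fraction(1, 2)),
    (Fraction(1, 3), Fraction(2, 3)),
    (Fraction(1, 5), Fraction(4, 5)),
]

gamma, lam, w1, w2 = split_signed_combination(coeffs, functions)
combined = tuple(gamma * a - lam * b for a, b in zip(w1, w2))
print(gamma, lam)  # gamma exceeds lam by exactly 1
print(w1, w2)      # both are again probability vectors
print(combined)    # the original signed combination
```

Because the signed combination has total mass 1 and each of `w1`, `w2` has total mass 1, the relation `gamma == 1 + lam` is forced, which is the content of the last display above.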

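It remains to show that the $\vec p_{\vec m}$ can be chosen such that $A$ is regular. For this, we will
show the following by induction on $j$:

Let $1 \le i_1 < i_2 < \dots < i_j \le r$ and let $A_{(i_1, \dots, i_j)}$ be the $j \times j$ sub-matrix of $A$ obtained
by taking the $i_1, \dots, i_j$'th rows and columns of $A$. Then there is a choice of the $\vec p_{\vec m_k}$,
$k = i_1, \dots, i_j$, such that $A_{(i_1, \dots, i_j)}$ is regular.

For $j = 1$, this is trivial. Suppose $j = n + 1$ for some $n \ge 1$ and consider $A_{(i_1, \dots, i_j)}$. For a
given $\vec m \in K$, the polynomial $\prod_{j=1}^{q} x_j^{m_j}$ takes its maximum value on $\mathbb{D}_q$ at $x_j = m_j/q$.
Fix an enumeration of $K$. There exists $\vec m_{i_k} = (m_{i_k,1}, \dots, m_{i_k,q})$ such that

\[
\prod_{s=1}^{q} \Bigl(\frac{m_{i_k,s}}{q}\Bigr)^{m_{i_k,s}} > \prod_{s=1}^{q} \Bigl(\frac{m_{i_k,s}}{q}\Bigr)^{m_{i_j,s}}
\]

for all $j \neq k$. For if not, then

\[
\prod_{s=1}^{q} \Bigl(\frac{m_{i_k,s}}{q}\Bigr)^{m_{i_k,s}} \le \prod_{s=1}^{q} \Bigl(\frac{m_{i_k,s}}{q}\Bigr)^{m_{i_j,s}} < \prod_{s=1}^{q} \Bigl(\frac{m_{i_j,s}}{q}\Bigr)^{m_{i_j,s}}
\]

for some $j \neq k$, and continuing in this way we arrive at a contradiction.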

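By the inductive hypothesis, there exists a choice of the $\vec p_{\vec m_s}$, $s \in \{i_1, \dots, i_j\} \setminus \{i_k\}$, such
that the sub-matrix $A_{(i_1, \dots, i_{k-1}, i_{k+1}, \dots, i_j)}$ is regular. Thinking of the $p_{\vec m_{i_k},s}$ for the moment
as unknowns we obtain for the determinant of $A_{(i_1, \dots, i_j)}$ an expression of the form

\[
\det(A_{(i_1, \dots, i_j)}) = \pm \prod_{s=1}^{q} p_{\vec m_{i_k},s}^{\,m_{i_k,s}} \det(A_{(i_1, \dots, i_{k-1}, i_{k+1}, \dots, i_j)}) + \sum_{t \in \{i_1, \dots, i_j\} \setminus \{i_k\}} \prod_{s=1}^{q} p_{\vec m_{i_k},s}^{\,m_{t,s}} \bigl(\pm \det(A_t)\bigr) \tag{26}
\]

(for some choices of $\pm$) where the $A_t$ are the corresponding sub-matrices of $A_{(i_1, \dots, i_j)}$.
Now picking $p_{\vec m_{i_k},s} = (m_{i_k,s}/q)^g$ for large enough $g > 0$, the term

\[
\prod_{s=1}^{q} p_{\vec m_{i_k},s}^{\,m_{i_k,s}} \det(A_{(i_1, \dots, i_{k-1}, i_{k+1}, \dots, i_j)})
\]

becomes the dominant term of (26), giving that $\det(A_{(i_1, \dots, i_j)}) \neq 0$, as certainly
$\prod_{s=1}^{q} p_{\vec m_{i_k},s}^{\,m_{i_k,s}} > 0$ and $\det(A_{(i_1, \dots, i_{k-1}, i_{k+1}, \dots, i_j)}) \neq 0$ by the inductive hypothesis.

Note that using this procedure we in general obtain $\vec p_{\vec m}$ with entries not summing
to 1. In that case, we can pick $\vec p'_{\vec m}$ such that

\[
p'_{\vec m,s} = \frac{p_{\vec m,s}}{\sum_{t=1}^{q} p_{\vec m,t}}, \qquad s = 1, \dots, q,
\]

for each $\vec m \in K$. Then the matrix $A'$ with entries $\prod_{s=1}^{q} (p'_{\vec m,s})^{n_s}$ is regular just if $A$ is, and
the $\vec p'_{\vec m}$ have the desired properties. ⊣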
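The construction of a regular matrix $A$ can be checked exactly in the smallest non-trivial case. The sketch below takes $q = 2$ and, as a simplifying assumption that differs from the proof, uses one common exponent $g$ for every row before normalising; the proof itself only guarantees regularity for a suitable choice of exponents. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def index_set_K(q):
    """All tuples n of q non-negative integers with n_1 + ... + n_q = q."""
    return [n for n in product(range(q + 1), repeat=q) if sum(n) == q]

def entry(p, n):
    """Matrix entry prod_k p_k^{n_k}, with the convention 0^0 = 1."""
    out = Fraction(1)
    for p_k, n_k in zip(p, n):
        out *= p_k ** n_k
    return out

def build_A(q, g):
    """Rows indexed by m in K with p_m proportional to (m_s / q)^g, normalised to sum 1."""
    K = index_set_K(q)
    rows = []
    for m in K:
        raw = [Fraction(m_s, q) ** g for m_s in m]
        total = sum(raw)
        p = [x / total for x in raw]
        rows.append([entry(p, n) for n in K])
    return K, rows

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result

K, A = build_A(2, 3)
print(K)
print(det(A))  # non-zero, so A is regular for q = 2
```

Other small values of `q` and `g` can be tried in the same way, although no claim is made here about which of them yield a regular matrix.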



Using this lemma, we can now prove the desired theorem. 

Theorem 14 (General Representation Theorem for w satisfying Px). Let $w$
be a probability function on $SL$ satisfying Px. Then there exist $\lambda > 0$ and probability
functions $w_1, w_2$ satisfying ULi such that

\[
w = (1 + \lambda)w_1 - \lambda w_2.
\]

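Proof: Let $w$ be a probability function on $SL$ satisfying Px. By the Representation
Theorem for Px, we have that $w$ has a representation

\[
w = \int_{\mathbb{D}_{2^q}} y_{\vec c}\,d\mu(\vec c) \tag{27}
\]

for some measure $\mu$, and by Lemma 13 we have, for a fixed $\lambda > 0$, a representation

\[
y_{\vec c} = (1 + \lambda)w_{1,\vec c} - \lambda w_{2,\vec c}
\]

for each $\vec c \in \mathbb{D}_{2^q}$. Now applying this to the representation (27), we obtain

\[
\begin{aligned}
w &= \int_{\mathbb{D}_{2^q}} (1 + \lambda)w_{1,\vec c} - \lambda w_{2,\vec c}\,d\mu(\vec c) \\
  &= \int_{\mathbb{D}_{2^q}} (1 + \lambda)w_{1,\vec c}\,d\mu(\vec c) - \int_{\mathbb{D}_{2^q}} \lambda w_{2,\vec c}\,d\mu(\vec c) \\
  &= (1 + \lambda)w_1 - \lambda w_2,
\end{aligned}
\]

for

\[
w_1 = \int_{\mathbb{D}_{2^q}} w_{1,\vec c}\,d\mu(\vec c), \qquad w_2 = \int_{\mathbb{D}_{2^q}} w_{2,\vec c}\,d\mu(\vec c),
\]

as required. ⊣

6 Conclusion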



With Theorem 9, the Representation Theorem for $w$ satisfying ULi, we have shown that the
building blocks for probability functions satisfying Unary Language Invariance all satisfy
Weak Irrelevance, and that in fact these are the only ones that satisfy this principle.
This is analogous to the situation with Atom Exchangeability, Ax, and its generalization
to Polyadic Pure Inductive Logic, Spectrum Exchangeability, see [12]. This analogy also
extends to the General Representation Theorem, stating that each probability function
satisfying Px is a scaled difference of probability functions satisfying ULi (see Theorem 14).

Throughout this paper we have worked in the conventional Unary Pure Inductive Logic.
Recently, however, there has been a rapid development of Polyadic Pure Inductive Logic
(again see [12]) and we anticipate that the Representation Theorem for ULi functions
can be extended to the polyadic case, using the same methods as demonstrated above.
A classification for probability functions on polyadic languages satisfying Language
Invariance would give rise to the question whether we can find a corresponding General
Representation Theorem for the polyadic case as well.